Decision Problems of Tree Transducers with Origin
Abstract
A tree transducer with origin translates an input tree into a pair consisting of an output tree and origin information. The origin information maps each node of the output tree to the unique input node that created it. In this way, the implementation of the transducer becomes part of its semantics. We show that the landscape of decidable properties changes drastically when origin information is added. For instance, equivalence of nondeterministic top-down transducers and of MSO transducers with origin is decidable; both problems are undecidable without origin. The equivalence of deterministic top-down tree-to-string transducers is decidable with origin, while without origin it is a long-standing open problem. With origin, we can decide whether a deterministic macro tree transducer can be realized by a deterministic top-down tree transducer; without origin this is an open problem.

Tree transducers were invented in the early 1970s as a formal model for compilers and linguistics [24, 23]. They are applied in many fields of computer science, such as syntax-directed translation [13], databases [22, 15], linguistics [19, 4], programming languages [27, 21], and security analysis [16]. The most essential feature of tree transducers is their good balance between expressive power and decidability.

Bojańczyk [3] introduces (string) transducers with origin. For “regular” string-to-string transducers with origin he presents a machine-independent characterization which admits Angluin-style learning and the decidability of natural subclasses. These results indicate that classes of translations with origin are mathematically better behaved than their origin-less counterparts.

We initiate a rigorous study of tree transducers with origin by investigating the decidability of equivalence, injectivity, and query determinacy for the following models: top-down tree-to-tree transducers [24, 23], top-down tree-to-string transducers [11], and MSO definable tree-to-string transducers (see, e.g., [10]).
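To make the origin semantics concrete, the following is a minimal Python sketch, not the paper's formalization. It assumes trees are encoded as `(label, children)` pairs and input nodes are addressed by paths of child indices. The names `Call`, `run`, `strip`, and the two rule tables `T1` and `T2` are hypothetical and introduced only for illustration. Both toy transducers map the monadic input a(a(e)) to b(b(e)), yet they attach different origins to the output nodes, so they are equivalent without origin but not with origin.

```python
class Call:
    """Leaf of a right-hand side: continue in `state` on child number `child`."""
    def __init__(self, state, child):
        self.state = state
        self.child = child


def run(rules, state, tree, path=()):
    """Apply a deterministic top-down transducer to `tree`.

    Each output node is returned as (label, origin, children), where
    origin is the path of the input node whose rule created that node.
    """
    label, kids = tree
    rhs = rules[(state, label)]

    def build(t):
        if isinstance(t, Call):
            # Recursive call on a child of the current input node.
            return run(rules, t.state, kids[t.child], path + (t.child,))
        out_label, out_kids = t
        return (out_label, path, [build(k) for k in out_kids])

    return build(rhs)


def strip(t):
    """Forget the origin information of an output tree."""
    label, _origin, kids = t
    return (label, [strip(k) for k in kids])


# T1 emits one b per input a, at that a.
T1 = {
    ("q", "a"): ("b", [Call("q", 0)]),
    ("q", "e"): ("e", []),
}

# T2 (assumed variant) skips the root and emits an extra b at the leaf.
T2 = {
    ("q0", "a"): Call("p", 0),
    ("q0", "e"): ("e", []),
    ("p", "a"): ("b", [Call("p", 0)]),
    ("p", "e"): ("b", [("e", [])]),
}

inp = ("a", [("a", [("e", [])])])   # the input tree a(a(e))
o1 = run(T1, "q", inp)
o2 = run(T2, "q0", inp)

print(strip(o1) == strip(o2))   # same output tree
print(o1 == o2)                 # but different origins
```

Comparing `o1` and `o2` directly corresponds to equivalence with origin on this single input, while comparing the stripped trees corresponds to ordinary equivalence.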
The authors are grateful to Joost Engelfriet for his remarks, improvements, and corrections on a preliminary version of this work. This work has been carried out thanks to the support of the ARCHIMEDE Labex (ANR-11-LABX-0033) and the A*MIDEX project (ANR-11-IDEX-0001-02) funded by the “Investissements d’Avenir” French Government program, managed by the French National Research Agency (ANR), and by the PEPS project “Synthesis of Stream Processors” funded by CNRS.

Table 1. Decidability of equivalence

                 top-down tree-to-tree   top-down tree-to-string   MSO tree-to-string
                 det        nd           det        nd             det        nd
without origin   + [12]     − [14]       ?          − [14]         + [10]     −
with origin      +          +            +          −              +          +

Unlike the string transducers of Bojańczyk [3], we will see that equivalent models of tree-to-string transducers do not remain equivalent in the presence of origin. This motivates the study of subclass definability problems (definability of a transduction from a class in a subclass) under the origin semantics.

Table 1 summarizes our results on equivalence; nondeterministic and deterministic are abbreviated by nd and det, and decidable and undecidable by + and −. The “?” marks a long-standing open problem, already mentioned by Engelfriet [7]. The first change from − to + is the equivalence of nondeterministic top-down tree transducers. In the non-origin case this problem is already undecidable for restricted string-to-string transducers [14]. In the presence of origin it becomes decidable for tree transducers, because origin implies that any connected region of output nodes with the same origin is generated by one single rule. Hence, the problem reduces to letter-to-letter transducers [1].

What about nondeterministic top-down tree-to-string transducers (column four in Table 1)? Here output patterns cannot be treated as letters. By deferring output generation to a leaf, these transducers can simulate non-origin translations with undecidable equivalence [14]. Finally, we discuss column three.
Here the origin information induces a structure on the output strings: recursive calls of origin-equivalent transducers must occur in similar “blocks”, so that the same children of the current input node are visited in the same order (but possibly with differing numbers of recursive calls). This block structure allows us to reason over single input paths, and to reduce the problem to deterministic tree-to-string transducers with monadic input. The latter can be reduced [20] to the famous HDT0L sequence equivalence problem.

Injectivity for deterministic transducers is undecidable for all origin-free models of Table 1. With origin, we prove undecidability in the tree-to-string case and decidability in the MSO and top-down tree cases. The latter is again due to the rigid structure implied by origins: we can track whether two different inputs, over the same input nodes, produce the same output tree. We use the convenient framework of recognizable relations to show that the set of trees for which a transducer with origin produces the same output can be recognized by a tree automaton.

Motivation. Clearly, the more information we include in a transformation, the more properties become decidable. Consider invertibility: at one extreme, if all reads and writes are recorded (under ACID), then any computation becomes invertible. The question then arises of how much information needs to be included for a transformation to be invertible. This problem has recently received much attention in the programming language community (see, e.g., [26]). Our work here was inspired by the very similar view/query determinacy problem. This problem asks, for a given view and query, whether the query can be answered on the output of the view. It was shown decidable in [2] for views that are linear extended tree transducers, and queries that are deterministic MSO or top-down transducers. For views that include copying, the problem quickly becomes undecidable [2].
Our results show that such views can be supported if origin is included. Consider for instance a view that regroups a list of publications into sublists of books, articles, etc. A tree transducer realizing this view needs copying (i.e., it needs to process the original list multiple times). Without origin, we do not know a procedure that decides determinacy for such a view. With origin, we prove that determinacy is decidable. As expected, the world becomes safer with origin, but more restrictive (e.g., the query “is book X before article Y in the original list?” becomes determined when origin is added to the above view).

The tracking of origin information has been studied in the programming language community; see [25]. As a technical tool it was used in [9] to characterize the MSO definable macro tree translations, and in [17] to give a Myhill-Nerode theorem for deterministic top-down tree transducers. From a linguistic point of view, origin mappings on their own are a subject of interest and are called “dependencies” or “links”. Maletti [18] shows that dependencies (i.e., origins) give “surprising insights” into the structure of tree transformations: many separation results concerning expressive power can be obtained on the level of dependencies.
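The publication-list view from the motivation above can be illustrated with a small Python sketch. It is a plain function rather than a tree transducer, and the names `view`, `before`, `list1`, and `list2` are assumptions made only for this example. Origins are modelled as positions in the original list. Without origin, two differently ordered lists yield the same view, so the ordering query cannot be answered from the view alone; with origin, the query can be read off the view output.

```python
def view(pubs, with_origin=False):
    """Regroup a publication list into books followed by articles.

    The list is traversed once per group, which mirrors the copying
    a tree transducer would need for this view.
    """
    groups = {}
    for kind in ("book", "article"):
        groups[kind] = [
            (title, i) if with_origin else title
            for i, (k, title) in enumerate(pubs)
            if k == kind
        ]
    return groups


def before(view_out, book, article):
    """Answer 'is `book` before `article` in the original list?'
    using only an origin-annotated view output."""
    pos_book = dict(view_out["book"])[book]
    pos_article = dict(view_out["article"])[article]
    return pos_book < pos_article


list1 = [("book", "X"), ("article", "Y")]
list2 = [("article", "Y"), ("book", "X")]

# Without origin the two inputs are indistinguishable in the view.
print(view(list1) == view(list2))
# With origin the view outputs differ and determine the query.
print(before(view(list1, with_origin=True), "X", "Y"))
print(before(view(list2, with_origin=True), "X", "Y"))
```

The sketch only shows why the ordering query becomes determined on this toy view; it says nothing about the decision procedure for determinacy itself.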
Publication date: 2015